[opt](exec)lazy deserialize pblock in VDataStreamRecvr::SenderQueue #44378

Merged: 2 commits merged into apache:master on Nov 26, 2024

@Mryange Mryange commented Nov 21, 2024

What problem does this PR solve?

Previously, a pblock (serialized block) was deserialized immediately upon receipt of the
RPC request and only then placed into the data_queue.
Deserialization therefore ran on the RPC handling path and consumed significant time there,
hurting overall performance.
The new approach defers deserialization until getBlock is called. This has the following advantages:

  1. Reduces time spent during the RPC handling phase.
  2. Memory allocation for deserialization happens within the execution thread, improving cache locality
    and reducing contention on memory resources.
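
The deferral described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `SerializedBlock`, `Block`, `LazyQueue`, `add_serialized`, `get_block`, and `deserialized_count` are hypothetical names, and the "deserialization" is a toy stand-in that parses comma-separated integers. The point is only that the RPC side enqueues raw bytes, while the decode (and its memory allocation) happens in the thread that pulls the block.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a serialized block (pblock).
struct SerializedBlock {
    std::string bytes; // toy wire format: comma-separated integers
};

// Hypothetical stand-in for a deserialized block.
struct Block {
    std::vector<int> values;
};

// Toy deserializer; stands in for the expensive decode step.
inline Block deserialize(const SerializedBlock& pb) {
    Block block;
    std::stringstream ss(pb.bytes);
    std::string item;
    while (std::getline(ss, item, ',')) {
        if (!item.empty()) {
            block.values.push_back(std::stoi(item));
        }
    }
    return block;
}

class LazyQueue {
public:
    // RPC side: only moves the serialized bytes into the queue, no decoding.
    void add_serialized(SerializedBlock pb) {
        std::lock_guard<std::mutex> l(_lock);
        _queue.push_back(std::move(pb));
    }

    // Execution side: pop under the lock, then decode outside the lock so
    // that allocation happens in the calling (execution) thread.
    bool get_block(Block* out) {
        assert(out != nullptr);
        SerializedBlock pb;
        {
            std::lock_guard<std::mutex> l(_lock);
            if (_queue.empty()) {
                return false;
            }
            pb = std::move(_queue.front());
            _queue.pop_front();
        }
        *out = deserialize(pb);
        _deserialized.fetch_add(1);
        return true;
    }

    // Number of blocks decoded so far (for illustration only).
    std::size_t deserialized_count() const { return _deserialized.load(); }

private:
    std::mutex _lock;
    std::deque<SerializedBlock> _queue;
    std::atomic<std::size_t> _deserialized{0};
};
```

In this sketch, `add_serialized` does constant work per block regardless of block size, so the cost of decoding is moved entirely to `get_block`.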

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@Mryange
Contributor Author

Mryange commented Nov 21, 2024

run buildall

Contributor

@github-actions github-actions bot left a comment


clang-tidy made some suggestions

@@ -18,6 +18,7 @@
#pragma once

#include <gen_cpp/Types_types.h>
Contributor


warning: 'gen_cpp/Types_types.h' file not found [clang-diagnostic-error]

#include <gen_cpp/Types_types.h>
         ^

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 38.03% (9902/26040)
Line Coverage: 29.22% (82834/283522)
Region Coverage: 28.35% (42544/150080)
Branch Coverage: 24.89% (21557/86596)
Coverage Report: http://coverage.selectdb-in.cc/coverage/fb54dcd43abf1465e158ec4617e8f9f87af72e51_fb54dcd43abf1465e158ec4617e8f9f87af72e51/report/index.html

@Mryange
Contributor Author

Mryange commented Nov 21, 2024

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 38.03% (9902/26038)
Line Coverage: 29.21% (82811/283515)
Region Coverage: 28.35% (42548/150077)
Branch Coverage: 24.91% (21566/86590)
Coverage Report: http://coverage.selectdb-in.cc/coverage/6c96e8d5389757921d70937a3c8652f586a660ea_6c96e8d5389757921d70937a3c8652f586a660ea/report/index.html

@Mryange Mryange changed the title from "[only test]" to "[only test lazy-deserialize]" Nov 24, 2024
@Mryange
Contributor Author

Mryange commented Nov 24, 2024

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 38.30% (9980/26056)
Line Coverage: 29.44% (83555/283844)
Region Coverage: 28.59% (42983/150358)
Branch Coverage: 25.17% (21838/86746)
Coverage Report: http://coverage.selectdb-in.cc/coverage/291194a34d8a6f350bd5c97facadc8002dce1da5_291194a34d8a6f350bd5c97facadc8002dce1da5/report/index.html

@Mryange Mryange changed the title from "[only test lazy-deserialize]" to "[opt](exec)lazy deserialize pblock in VDataStreamRecvr::SenderQueue" Nov 25, 2024
@Mryange
Contributor Author

Mryange commented Nov 25, 2024

run buildall

@Mryange Mryange marked this pull request as ready for review November 25, 2024 03:15
_recvr->_parent->memory_used_counter()->update(-(int64_t)block_byte_size);
std::lock_guard<std::mutex> l(_lock);
Contributor


rethink the logic

@Mryange
Contributor Author

Mryange commented Nov 25, 2024

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 38.31% (9981/26056)
Line Coverage: 29.44% (83560/283849)
Region Coverage: 28.60% (42995/150358)
Branch Coverage: 25.18% (21841/86746)
Coverage Report: http://coverage.selectdb-in.cc/coverage/9e8a45e697665c3499b93bcd735ec3e1886dac55_9e8a45e697665c3499b93bcd735ec3e1886dac55/report/index.html

@Mryange
Contributor Author

Mryange commented Nov 25, 2024

run p0

@Mryange
Contributor Author

Mryange commented Nov 25, 2024

run p0

Contributor

@HappenLee HappenLee left a comment


LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Nov 25, 2024
Contributor

PR approved by at least one committer and no changes requested.

Contributor

PR approved by anyone and no changes requested.

@HappenLee HappenLee merged commit 378b2f2 into apache:master Nov 26, 2024
26 of 28 checks passed
Mryange added a commit to Mryange/doris that referenced this pull request Jan 10, 2025
…pache#44378)

Previously, for a `pblock` (serialized block), the block would be deserialized immediately after receiving the RPC request and then placed into the `data_queue`.
This approach caused significant time consumption during RPC processing due to the deserialization process, impacting overall performance.
The new approach defers deserialization until `getBlock` is called. This has the following advantages:
1. Reduces time spent during the RPC handling phase.
2. Memory allocation for deserialization happens within the execution thread, improving cache locality and reducing contention on memory resources.

- Test
    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [x] No need to test or manual test. Explain why:
        - [x] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason

- Behavior changed:
    - [x] No.
    - [ ] Yes.

- Does this need documentation?
    - [x] No.
    - [ ] Yes.

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label
Labels
approved Indicates a PR has been approved by one committer. reviewed